ST-MVL: Filling Missing Values in Geo-Sensory Time Series Data
Abstract
Many sensors have been deployed in the physical world, generating massive geo-tagged time series data. In practice, sensor readings are often lost at unexpected moments because of sensor or communication errors. These missing readings not only affect real-time monitoring but also compromise the performance of further data analysis. In this paper, we propose a spatio-temporal multi-view-based learning (ST-MVL) method to collectively fill missing readings in a collection of geo-sensory time series data, considering 1) the temporal correlation between readings at different timestamps in the same series and 2) the spatial correlation between different time series. Our method combines empirical statistical models, consisting of Inverse Distance Weighting and Simple Exponential Smoothing, with data-driven algorithms, comprising User-based and Item-based Collaborative Filtering. The former models handle general missing cases based on empirical assumptions derived from historical data over a long period, and represent two global views, from the spatial and temporal perspectives respectively. The latter algorithms deal with special cases where the empirical assumptions may not hold, based on the recent context of the data, and represent two local views, from the spatial and temporal perspectives respectively. The predictions of the four views are aggregated into a final value by a multi-view learning algorithm. We evaluate our method on Beijing air quality and meteorological data, finding advantages to our model compared with ten baseline approaches.
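As a rough illustration of the two global views named in the abstract, the following Python sketch estimates one missing reading with Inverse Distance Weighting (spatial) and an exponential-smoothing-style average (temporal), then blends the two. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`idw_estimate`, `ses_estimate`, `combine_views`), the parameters `alpha` and `beta`, the two-sided smoothing weights, and the fixed blending weights are all hypothetical choices made for illustration. The abstract does not specify how the multi-view weights are learned, and the two collaborative-filtering views are omitted here.

```python
import math

def idw_estimate(target_xy, neighbors, alpha=2.0):
    """Spatial view: Inverse Distance Weighting.

    neighbors is a list of ((x, y), reading) pairs from other sensors
    at the same timestamp. alpha (assumed) controls distance decay.
    """
    num = den = 0.0
    for (x, y), value in neighbors:
        d = math.hypot(x - target_xy[0], y - target_xy[1])
        if d == 0:
            return value  # co-located sensor: use its reading directly
        w = d ** -alpha
        num += w * value
        den += w
    if den == 0:
        raise ValueError("no neighboring readings available")
    return num / den

def ses_estimate(series, t, beta=0.5):
    """Temporal view: exponentially decaying weights around index t.

    series is one sensor's readings, with None marking missing values.
    Readings closer in time to t receive larger weights (assumed form).
    """
    num = den = 0.0
    for i, value in enumerate(series):
        if value is None or i == t:
            continue
        w = beta * (1 - beta) ** (abs(i - t) - 1)
        num += w * value
        den += w
    if den == 0:
        raise ValueError("no known readings in this series")
    return num / den

def combine_views(predictions, weights, bias=0.0):
    """Blend per-view predictions linearly (weights are assumed, not learned)."""
    return bias + sum(w * p for w, p in zip(weights, predictions))

# Fill the missing reading at index 1 of one sensor's series.
spatial = idw_estimate((0.0, 0.0), [((1.0, 0.0), 10.0), ((2.0, 0.0), 20.0)])
temporal = ses_estimate([10.0, None, 20.0], t=1)
filled = combine_views([spatial, temporal], weights=[0.5, 0.5])
```

In a fuller treatment the blending weights would be fitted from data where the true readings are known, and the two local collaborative-filtering views would contribute two more entries to `predictions`.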
Related Resources
AAAI Proceedings Template
Many sensors have been deployed in the physical world, generating massive geo-tagged time series data. In practice, sensor readings are often lost at unexpected moments because of sensor or communication errors. These missing readings not only affect real-time monitoring but also compromise the performance of further data analysis. In this paper, we propose a spatio-temporal mul...
Avoid Filling Swiss Cheese with Whipped Cream: Imputation Techniques and Evaluation Procedures for Cross-Country Time Series; by Michaela Denk, Michael Weber; IMF Working Paper 11/151; June 1, 2011
International organizations collect data from national authorities to create multivariate cross-sectional time series for their analyses. As data from countries whose statistical systems are not yet well established may be incomplete, bridging data gaps is a crucial challenge. This paper investigates data structures and missing data patterns in the cross-sectional time series framework, reviews...
Missing data imputation in multivariable time series data
Multivariate time series data are found in a variety of fields such as bioinformatics, biology, genetics, astronomy, geography and finance. Many time series datasets contain missing data. Imputing missing data in multivariate time series is a challenging problem that needs careful consideration before any learning or prediction on the series. Much research has been done on the use of diffe...
Paper SD16 Missing Data Values: Analyzing their Effects on Rainfall Forecasts Using PROC EXPAND and the SAS Time-Series Forecasting System
Missing values are a common problem faced in the analysis of time-series data. Many SAS® time-series procedures (PROC ARIMA, PROC VARMAX, etc.) are intolerant of missing values, particularly when these missing values are embedded in the time-series, rather than occurring at the beginning or end of the series. PROC EXPAND is designed to convert time series from one sampling interval or frequency...
Time series forecasting of the number of disabilities related to occupational accidents among Social Security insured persons in Iran between the years 1379 and 1389, using the Box-Jenkins analysis method
Background: Controlling the occurrence of accidents in the workplace has been a subject of interest in countries worldwide. The financial consequences of these accidents and the economic losses they impose on the companies involved are only one of the less significant aspects of such damage, and when the non-economic, intangible losses to society are taken into consideration, these economic damag...